Estimating Statistics on Words Using Ambiguous Descriptions

Author

  • Cyril Nicaud
Abstract

In this article we propose an alternative way to prove some recent results on statistics on words, such as the expected number of runs or the expected sum of the run exponents. Our approach consists in designing a general framework, based on the symbolic method developed in analytic combinatorics. The descriptions obtained in this framework are built in such a way that the degree of ambiguity of an object O (i.e., the number of different descriptions corresponding to O) is exactly the value of the statistic under study for O. The asymptotic estimation of the expectation is then done using classical techniques from analytic combinatorics. To show the generality of our method, we not only apply it to obtain new proofs of known results, but also extend them from the uniform distribution to any memoryless distribution.

1998 ACM Subject Classification: G.2.1 Combinatorics.
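As a rough illustration of the statistics named in the abstract, the following sketch computes the exact expected number of runs, and the expected sum of run exponents, by brute-force enumeration of all words of a small length under a memoryless distribution. It is not the symbolic method of the article; it only provides small exact values that an asymptotic estimate could be compared against. It assumes the usual definition from combinatorics on words: a run is a maximal factor whose length is at least twice its smallest period, and its exponent is its length divided by that period. The function names (runs, number_of_runs, exponent_sum, expected_value) are hypothetical and chosen for this example only.

```python
from fractions import Fraction
from itertools import product


def runs(word):
    """Return {(start, end): smallest_period} for every run of `word`.

    Assumed definition: a run is a maximal factor word[start:end] whose
    length is at least twice its smallest period (end is exclusive).
    """
    n = len(word)
    found = {}
    # Periods are tried in increasing order, so the first period recorded
    # for an interval is its smallest one.
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if word[k] != word[k + p]:
                k += 1
                continue
            start = k
            while k + p < n and word[k] == word[k + p]:
                k += 1
            # Positions start..k-1 match, so word[start:k+p] has period p
            # and cannot be extended on either side with that period.
            if k - start >= p:
                found.setdefault((start, k + p), p)
    return found


def number_of_runs(word):
    return len(runs(word))


def exponent_sum(word):
    return sum(Fraction(end - start, p) for (start, end), p in runs(word).items())


def expected_value(n, probs, statistic):
    """Exact expectation of `statistic` over words of length n whose letters
    are drawn independently with probabilities `probs` (memoryless model)."""
    total = Fraction(0)
    for word in product(list(probs), repeat=n):
        weight = Fraction(1)
        for letter in word:
            weight *= probs[letter]
        total += weight * statistic(word)
    return total


if __name__ == "__main__":
    uniform = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
    biased = {"a": Fraction(2, 3), "b": Fraction(1, 3)}
    for n in range(2, 9):
        print(n,
              expected_value(n, uniform, number_of_runs),
              expected_value(n, biased, number_of_runs),
              expected_value(n, uniform, exponent_sum))
```

The enumeration costs time exponential in n, so it is only practical for short words. Seen through the abstract's lens, each pair (word, one of its runs) plays the role of one description of the word, so the number of descriptions of a word is its number of runs.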

Similar resources

Estimating the Parameters for Linking Unstandardized References with the Matrix Comparator

This paper discusses recent research on methods for estimating configuration parameters for the Matrix Comparator used for linking unstandardized or heterogeneously standardized references. The matrix comparator computes the aggregate similarity between the tokens (words) in a pair of references. The two most critical parameters for the matrix comparator for obtaining the best linking results a...

Solving the Polysemy Problem of Persian Words Using Mutual Information Statistics

In recent years, large monolingual, comparable and parallel corpora have played a very crucial role in solving various problems of computational linguistics including machine translation, information retrieval, natural language processing, and the like. This paper tries to solve the problem of polysemy of Persian words while translating them into Persian by the computer. We use Mutual Informati...

Proof nets for controlling ambiguity in natural language processing

We propose in this paper the use of two kinds of constraints in order to control the evaluation of ambiguous structures. The first ones concern the immediate context of the words. In case of ambiguity, these constraints form a network controlling an ambiguous area. The second kind of constraints relies on descriptions of the elementary trees that can be attached to the words. Such descriptions, c...

Semi-supervised Dependency Parsing using Bilexical Contextual Features from Auto-Parsed Data

We present a semi-supervised approach to improve dependency parsing accuracy by using bilexical statistics derived from auto-parsed data. The method is based on estimating the attachment potential of head-modifier words, by taking into account not only the head and modifier words themselves, but also the words surrounding the head and the modifier. When integrating the learned statistics as fea...

Effects of Ambiguous Gestures and Language on the Time Course of Reference Resolution

Two eye-tracking experiments investigated how and when pointing gestures and location descriptions affect target identification. The experiments investigated the effect of gestures and referring expressions on the time course of fixations to the target, using videos of human gestures and human voice, and animated gestures and synthesized speech. Ambiguous, yet informative pointing gestures elic...

Publication date: 2016